Barq: distributed multilingual internet search engine with focus on Arabic language

Authors

  • Tajje-eddine Rachidi
  • O. Iraqi
  • M. Bouzoubaa
  • A. B. E. Khattab
  • M. E. Kourdi
  • A. Zahi
  • Amine Bensaid
Abstract

This work was supported financially by Al Akhawayn University in Ifrane, Morocco, under R&D Grant RPF1/2001 and by CoreSoft SARL. 0-7803-7952-7/03/$17.00 © 2003 IEEE.

Barq is a distributed multilingual search engine with a focus on the Arabic language. Over a period of about two years, the Barq R&D project has involved work on Arabic language processing, Arabic word root extraction, indexing, information retrieval, automatic categorization, focused crawling, distributed computing, distributed database systems, and performance tuning. Barq indexes all documents on the web (or, optionally, on a particular site), including Word and XML documents, that contain at least one Arabic word in the CP1256, UTF-8, ISO8859_6, ASMO 449, or ASMO 708 code sets. The documents may also contain Latin-based characters. This paper focuses on the architecture and design patterns of Barq, as well as the types of search that Barq supports. Issues such as stemming and Arabic root extraction, indexing, ranking, precision and recall measurement, and automatic categorization are also presented, but their details are described in other works.
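The indexing criterion in the abstract (a document qualifies if it contains at least one Arabic word in one of several code sets) can be illustrated with a minimal sketch. This is not Barq's implementation, which the abstract does not describe. The function name `contains_arabic_word`, the two-letter definition of a "word", and the try-each-encoding heuristic are all assumptions made for illustration. Only the three code sets handled here under the Python codec names `utf-8`, `cp1256`, and `iso8859_6` are covered; ASMO 449 is omitted.

```python
import re

# Assumption: a "word" is two or more consecutive letters from the
# basic Arabic range U+0621..U+064A. Barq's real definition is not given.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]{2,}")

# Assumption: try each candidate code set in turn. UTF-8 goes first because
# it rejects most byte sequences produced by the single-byte code sets.
CANDIDATE_ENCODINGS = ("utf-8", "cp1256", "iso8859_6")


def contains_arabic_word(raw: bytes) -> bool:
    """Return True if any candidate decoding of raw contains an Arabic word."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            # The bytes are not valid in this code set; try the next one.
            continue
        if ARABIC_WORD.search(text):
            return True
    return False


if __name__ == "__main__":
    print(contains_arabic_word("Barq برق search".encode("utf-8")))
    print(contains_arabic_word(b"plain ASCII text"))
```

A heuristic like this can misclassify non-Arabic text in other single-byte encodings, so a production crawler would more likely rely on declared charsets or statistical detection.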


Similar resources

A Model for Multilingual Search Engine

Finding needed information on the web is a critical issue on the Internet. Fortunately, search engines are a useful tool for retrieving information from the Internet. Although Internet users speak different languages, most resources are written and published in English. All Internet search engines provide a lingual search, although some of them enable the searcher to select the language of ...


Multi-Language Text Indexing for Internet Retrieval

We address here the issues associated with indexing multilingual collections of information, as found for example on the Internet. We examine in particular the task of language identification and the use of stemming algorithms for several European languages. We also present the lessons we have learned from our experience in using the SPIDER information retrieval system as a search engine over...


Image Retrieval Using a Multilingual Ontology

Search engines are among the most useful Internet applications. There exist several media types on the Web and, given the particularities of each of them, adapted search solutions are required. We limit our discussion to image search engines. While rapid and robust, existing image search engines offer results that respond only partially to the user’s queries. An improvement of image search resu...


Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...


Modern Multilingual and Cross-lingual Information Access Technologies

In this chapter, we describe state-of-the-art cross-lingual and multilingual strategies and their related areas. In particular, we show a WWW-based information system called MIETTA, which allows uniform and multilingual access to heterogeneous data sources in the tourism domain. The design of the search engine is based on a new cross-lingual framework. The framework integrates a cross-lingu...




Publication date: 2003